Distributed Join Processing Between Streaming and Stored Big Data Under the Micro-Batch Model



Similar resources

Beyond Batch Processing: Towards Real-Time and Streaming Big Data

Today, big data is generated from many sources, and there is a huge demand for storing, managing, processing, and querying big data. The MapReduce model and its open-source implementation, Hadoop, have proven themselves the de facto solution for big data processing. Hadoop is inherently designed for batch and high-throughput processing jobs. Although Hadoop is very suitable for batch ...


k-Means for Streaming and Distributed Big Sparse Data

We provide the first streaming algorithm for computing a provable approximation to the k-means of sparse Big Data. Here, sparse Big Data is a set of n vectors in R^d, where each vector has O(1) non-zero entries and d ≥ n. Examples include adjacency matrices of graphs and web-link, social-network, document-term, or image-feature matrices. Our streaming algorithm stores at most log n · k input points in memo...


Intelligent Distributed Processing Methods for Big Data

Motivation: Today, “Big Data” is a new information overload problem in many different areas. Such areas include health care (e.g., medical records, bioinformatics), e-science (e.g., physics, chemistry, and geology), and the social sciences (e.g., politics). Thus, as we have various types of feasible data from a number of available sources, it is becoming increasingly difficult to efficient...


Big data for Natural Language Processing: A streaming approach

Requirements for computational power have grown dramatically in recent years. This is also the case in many language processing tasks, due to the overwhelming and ever-increasing amount of textual information that must be processed in a reasonable time frame. This scenario has led to a paradigm shift in the computing architectures and large-scale data processing strategies used in the Natural La...



Journal

Journal title: IEEE Access

Year: 2019

ISSN: 2169-3536

DOI: 10.1109/access.2019.2904730